Genomic signal processing for DNA sequence clustering

Authors

  • Gerardo Mendizabal-Ruiz
  • Israel Román-Godínez
  • Sulema Torres-Ramos
  • Ricardo A. Salido-Ruiz
  • Hugo Vélez-Pérez
  • J. Alejandro Morales
Abstract

Genomic signal processing (GSP) methods, which convert DNA data to numerical values, have recently been proposed, offering the opportunity to apply existing digital signal processing methods to genomic data. One of the most widely used methods for exploring data is cluster analysis, which refers to the unsupervised classification of patterns in data. In this paper, we propose a novel approach for performing cluster analysis of DNA sequences that is based on GSP methods and the K-means algorithm. We also propose a visualization method that facilitates the inspection and analysis of the results and of possible hidden behaviors. Our results support the feasibility of employing the proposed method to find and easily visualize interesting features of sets of DNA data.
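
The general idea of combining a DNA-to-signal mapping with K-means can be illustrated with a minimal sketch. The abstract does not say which mapping, features, or initialization the authors use, so everything below is an assumption made for illustration: a Voss-style binary indicator mapping (one 0/1 signal per nucleotide), power-spectrum features sampled at a few fixed normalized frequencies so that sequences of different lengths yield vectors of equal size, and plain Lloyd-style K-means with a deterministic farthest-point initialization. The names `spectral_features`, `kmeans`, and `cluster_sequences` are hypothetical.

```python
import cmath

BASES = "ACGT"

def spectral_features(seq, n_bins=6):
    """Map a non-empty DNA string to a fixed-length feature vector.

    Assumption: Voss-style indicator signals (1 where a base occurs, else 0).
    The summed power of the four indicator spectra is sampled at the
    normalized frequencies b / (2 * n_bins), b = 1..n_bins (cycles/sample).
    """
    seq = seq.upper()
    n = len(seq)
    feats = []
    for b in range(1, n_bins + 1):
        f = b / (2 * n_bins)
        power = 0.0
        for base in BASES:
            # Fourier sum over the positions where this base occurs
            x = sum(cmath.exp(-2j * cmath.pi * f * i)
                    for i, ch in enumerate(seq) if ch == base)
            power += abs(x) ** 2
        feats.append(power / n)  # normalize by sequence length
    return feats

def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - c) ** 2 for a, c in zip(p, q))

def kmeans(points, k, n_iter=50):
    """Lloyd-style K-means; returns one cluster label per point."""
    # Assumption: farthest-point initialization, chosen only for determinism
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(
            points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid for every point
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: mean of each cluster (empty clusters keep their centroid)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append([sum(col) / len(members)
                                for col in zip(*members)])
            else:
                updated.append(centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels

def cluster_sequences(seqs, k):
    """Convert each sequence to spectral features, then cluster them."""
    return kmeans([spectral_features(s) for s in seqs], k)

if __name__ == "__main__":
    toy = ["ATG" * 20, "CTGATG" * 10, "AC" * 30, "GT" * 25]
    print(cluster_sequences(toy, k=2))
```

On the toy input, the first two sequences carry a strong period-3 component and the last two a period-2 component, so they fall into separate clusters. Real genomic data would call for a mapping and feature set chosen for the question at hand, and typically for an established K-means implementation rather than this hand-written loop.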


Similar resources

Signal processing approaches as novel tools for the clustering of N-acetyl-β-D-glucosaminidases

Nowadays, the clustering of proteins, and of enzymes in particular, is one of the most popular topics in bioinformatics. An increasing number of chitinase genes from different organisms and their sequences have been identified. So far, various mathematical algorithms have been used for the clustering of chitinase genes, but most of them seem to be confusing and sometimes insufficient. In the...

Determination of aspartic protease gene dosage in the Onchocerca volvulus genome

Aspartic proteases are a relatively small group of enzymes that are expressed in various nematodes, including Onchocerca volvulus. An estimation of the gene copy number corresponding to the OV7A clone, which contains a cDNA insert encoding approximately two-thirds of the entire coding sequence of aspartic protease of O. volvulus, was made by slot blot analysis in a closely related species O. gibsonig...

DNA Sequence Visualization

This chapter introduces various visualization (i.e., graphical representation) schemes of symbolic DNA sequences, which are basically represented by character strings in conventional sequence databases. Several visualization schemes are reviewed and their characterizations are summarized for comparison. Moreover, further potential applications based on the visualized sequences are discussed. By...

Study of DNA Sequence Analysis Using DSP Techniques

Recently there have been great advances in bioinformatics and genomic signal processing. Digital Signal Processing (DSP) applications in genomic sequence analysis have received great attention in recent years. New methods are being developed to analyze deoxyribonucleic acid (DNA) sequences. In order to use DSP principles to analyze DNA sequences, the DNA sequences should be converted into numeric s...

Genomic Signal Processing Methods for Computation of Alignment-Free Distances from DNA Sequences

Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mappi...


Volume 6

Publication date: 2018